Query Optimization in Distributed Database Systems

Authors

  • Robert Taylor
  • Dan Olteanu
Abstract

The query optimizer is widely considered to be the most important component of a database management system. It takes a user query, searches the space of equivalent execution plans, and returns the plan with the lowest cost. This plan is then passed to the executor, which carries out the query. Plans can vary significantly in cost, so it is important for the optimizer to avoid very bad plans. In this thesis we consider queries in positive relational algebra form involving the conjunction of projections, selections and joins. The problem faced by everyday query optimizers grows more complex as user queries grow more complex. The NP-hard join ordering problem is a central problem that an optimizer must deal with in order to produce optimal plans. Fairly small queries, involving fewer than 10 relations, can be handled by existing algorithms such as the classic Dynamic Programming optimization algorithm. However, for complex queries, or for queries involving multiple execution sites in a distributed setting, the optimization problem becomes much more challenging and existing optimization algorithms struggle to cope with the complexity. In this thesis we present a cost model that allows opportunities for inter-operator parallelism to be identified within query execution plans, so that the response time of a query can be estimated more accurately. We merge two existing centralized optimization algorithms, DPccp and IDP1, to create IDP1ccp, an algorithm that is more efficient in practice. We propose the novel Multilevel optimization algorithm framework, which combines heuristics with existing centralized optimization algorithms. The distributed multilevel optimization algorithm (DistML) proposed in this thesis distributes the optimization phase across multiple optimization sites in order to fully utilize the available system resources.
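As a rough illustration of the classic dynamic-programming approach to join ordering mentioned in the abstract, the sketch below enumerates every subset of relations and keeps the cheapest plan for each. It is not the thesis's algorithm (it is neither DPccp nor IDP1ccp). The function name `best_join_order`, the input dictionaries, and the cost model (the sum of intermediate result sizes, with cross products allowed) are all assumptions made for this example.

```python
from itertools import combinations


def best_join_order(cardinalities, selectivity):
    """Toy dynamic-programming join enumerator (bushy plans, cross products allowed).

    cardinalities: dict mapping relation name -> row count (hypothetical input).
    selectivity:   dict mapping frozenset({a, b}) -> join selectivity;
                   pairs that are missing are treated as 1.0 (cross product).
    Assumed cost model: the sum of the sizes of all intermediate results.
    Returns (cost, cardinality, plan), where plan is a nested tuple of names.
    """
    rels = sorted(cardinalities)
    # best[subset] = (cost, cardinality, plan) for the cheapest plan found so far
    best = {frozenset([r]): (0.0, float(cardinalities[r]), r) for r in rels}

    def join_card(left, right, lcard, rcard):
        # Estimated output size: product of input sizes times every
        # selectivity that connects the two sides.
        card = lcard * rcard
        for a in left:
            for b in right:
                card *= selectivity.get(frozenset((a, b)), 1.0)
        return card

    # Build optimal plans for subsets of increasing size.
    for size in range(2, len(rels) + 1):
        for combo in combinations(rels, size):
            subset = frozenset(combo)
            # Splitting off up to half the relations covers every partition,
            # because the assumed cost model is symmetric in its inputs.
            for k in range(1, size // 2 + 1):
                for left_combo in combinations(combo, k):
                    left = frozenset(left_combo)
                    right = subset - left
                    lcost, lcard, lplan = best[left]
                    rcost, rcard, rplan = best[right]
                    card = join_card(left, right, lcard, rcard)
                    cost = lcost + rcost + card
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, card, (lplan, rplan))
    return best[frozenset(rels)]


# Hypothetical three-relation query: R joins S, and S joins T.
cost, card, plan = best_join_order(
    {"R": 1000, "S": 100, "T": 10},
    {frozenset(("R", "S")): 0.01, frozenset(("S", "T")): 0.1},
)
```

With these made-up statistics the enumerator joins S and T first, since that keeps the intermediate result small. The number of subsets grows exponentially with the number of relations, which is why exhaustive enumeration of this kind is practical only for fairly small queries.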

Similar resources

Using Heuristics and Genetic Algorithms for Large-scale Database Query Optimization

Distributed database system technology is one of the major developments in the information technology area. It will continue to have a very significant impact on data processing in the upcoming years because distributed database systems have many potential advantages over centralized systems for geographically distributed organizations. The continuing interest in distributed database systems in the...

A Query Optimization Strategy for Autonomous Distributed Database Systems

A distributed database is a collection of logically related databases that cooperate in a transparent manner. Query processing uses a communication network to transmit data between sites, and it is one of the challenges in the database world. The development of sophisticated query optimization technology is the reason for the commercial success of database systems, which complexity and co...

Query Optimization Concepts in Distributed Database

Query optimization is an important part of a database management system. This paper studies query optimization technology and, building on a number of optimization algorithms commonly used for distributed queries, aims to arrive at an optimal query processing plan for a given distributed query. As per the approach, the query plans having the required data residing close to each other...

Relational Databases Query Optimization using Hybrid Evolutionary Algorithm

Optimizing database queries is one of the hard research problems. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query increases, they need a great deal of memory and processing, so these methods become unsuitable and we have to use random and evolutionary methods. The use of evolutionary methods, beca...

Separating indexes from data: a distributed scheme for secure database outsourcing

Database outsourcing is an idea for removing the burden of database management from organizations. Since data is a critical asset of organizations, its privacy must be protected from outside adversaries and untrusted servers. In this paper, we present a distributed scheme based on storing shares of data on different servers and separating indexes from data on a distinct server. Shamir...

Optimization of Queries in Distributed Database Management System

The query optimizer is widely considered to be the most important component of a database management system. Query optimization is the process of producing an optimal (or close to optimal) query execution plan, which represents an execution strategy for the query. The optimizer is responsible for taking a user query and searching through the entire space of equivalent execution plans for a given user query and returning the ex...

Publication date: 2009